A Hierarchical Ensemble of Decision Trees Applied to Classifying Data from a Psychological Experiment

Author

  • Yannick Lallement
Abstract

Human-Computer Interaction Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213

Classifying complex data from psychology experiments by hand can be a long and difficult task, because of the quantity of data to classify and the amount of training it may require. One way to alleviate this problem is to use machine learning techniques. We built a classifier based on decision trees that reproduces the classifying process used by two humans on a sample of data and that learns how to classify unseen data. The automatic classifier proved to be more accurate, more consistent, and much faster than classification by hand.

Introduction

Classification of complex data coming from psychological experiments is an important issue in cognitive psychology. Such classification is often done by two or more persons who subjectively rate the human behavior. This process can be long and labor-intensive, and there is often too much data to be classified by humans in a reasonable amount of time. Moreover, the psychology community considers the classification acceptable if the inter-rater reliability between the persons is at least 80%, which, given the variability of human data, often demands a lot of training. We encountered such a problem regarding the classification of human behavior over time. We built a learning classifier to make the classification faster and more reliable. In the following, we describe our data, the classifier we built, and the results we obtained.

The KA-ATC© Task

The Kanfer-Ackerman Air Traffic Control© task (Ackerman & Kanfer 1994) is used to study problem solving and learning in a dynamic environment. When performing this task, participants are presented with ....

[Figure: KA-ATC task screen display, recovered from the extracted text; layout approximate.]

TIME: 1:00
Score: 1000
Landing Pts: 1100    Penalty Pts: 100
Runways: DRY
Wind: 10 20 knots from SOUTH
Flts in Queue: [value missing] to accept

FLT#  TYPE  SCHEDULED  FUEL  POS.
431   DC10  1:35       6     3 n   (cursor "->")
268   747   1:45       6     3 s
555   prop  1:50       6     3 e
111   747   2:00       6     3 w
912   727   1:15       6     2 n
67    prop  1:05       6     2 s
113   727   1:20       6     2 e
                             2 w
157   DC10  1:20       6     1 n
872   727   1:10       6     1 s
                             1 e
222   prop  1:00       5     1 w

Copyright © 1998, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.


Similar resources

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has increased. Credit card fraud is a crucial problem in banking and its danger is ever increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...


Ensemble machine learning on gene expression data for cancer classification.

Whole genome RNA expression studies permit systematic approaches to understanding the correlation of gene expression profiles with disease states or different developmental stages of a cell. Microarray analysis provides quantitative information about the complete transcription profile of cells, which facilitates drug and therapeutics development, disease diagnosis, and understanding in the basi...


Using Imaginary Ensembles to Select GP Classifiers

When predictive modeling requires comprehensible models, most data miners will use specialized techniques producing rule sets or decision trees. This study, however, shows that genetically evolved decision trees may very well outperform the more specialized techniques. The proposed approach evolves a number of decision trees and then uses one of several suggested selection strategies to pick on...


Evaluating the Performance of the Decision Tree Model in Estimating Suspended River Sediments (Case Study: Ilam Dam Basin)

Accurately estimating the volume of sediment carried by rivers is very important in water projects. In fact, finding the most important ways to calculate sediment discharge has been the objective of most research projects. Among these methods are machine learning methods such as the decision tree model (which is based on the principles of learning). Decision...


DIAGNOSIS OF BREAST LESIONS USING THE LOCAL CHAN-VESE MODEL, HIERARCHICAL FUZZY PARTITIONING AND FUZZY DECISION TREE INDUCTION

Breast cancer is one of the leading causes of death among women. Mammography remains today the best technology to detect breast cancer, early and efficiently, to distinguish between benign and malignant diseases. Several techniques in image processing and analysis have been developed to address this problem. In this paper, we propose a new solution to the problem of computer aided detection and...




Publication date: 1998